Distributed Clustering on Graphs

نویسندگان

  • Maria-Florina Balcan
  • Steven Ehrlich
  • Yingyu Liang
چکیده

This paper provides new algorithms for distributed clustering for two popular center-based objec-tives, k-median and k-means. These algorithms have provable guarantees and improve communicationcomplexity over existing approaches. Following a classic approach in clustering by [13], we reduce theproblem of finding a clustering with low cost to the problem of finding a ‘coreset’ of small size. Weprovide a distributed method for constructing a global coreset which improves over the previous methodsby reducing the communication complexity, and which works over general communication topologies.Experiment results on large scale data sets show that this approach outperforms other coreset-baseddistributed clustering algorithms.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Finding Community Base on Web Graph Clustering

Search Pointers organize the main part of the application on the Internet. However, because of Information management hardware, high volume of data and word similarities in different fields the most answers to the user s’ questions aren`t correct. So the web graph clustering and cluster placement in corresponding answers helps user to achieve his or her intended results. Community (web communit...

متن کامل

Entropy-based Consensus for Distributed Data Clustering

The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...

متن کامل

Hearing the clusters in a graph: A distributed algorithm

We propose a novel distributed algorithm to cluster graphs. The algorithm recovers the solution obtained from spectral clustering without the need for expensive eigenvalue/vector computations. We prove that, by propagating waves through the graph, a local fast Fourier transform yields the local component of every eigenvector of the Laplacian matrix, thus providing clustering information. For la...

متن کامل

Hearing the clusters of a graph: A distributed algorithm

We propose a novel distributed algorithm to cluster graphs. The algorithm recovers the solution obtained from spectral clustering without the need for expensive eigenvalue/vector computations. We prove that, by propagating waves through the graph, a local fast Fourier transform yields the local component of every eigenvector of the Laplacian matrix, which are used to cluster graphs. For large g...

متن کامل

graph2vec: Learning Distributed Representations of Graphs

Recent works on representation learning for graph structured data predominantly focus on learning distributed representations of graph substructures such as nodes and subgraphs. However, many graph analytics tasks such as graph classification and clustering require representing entire graphs as fixed length feature vectors. While the aforementioned approaches are naturally unequipped to learn s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1306.0604  شماره 

صفحات  -

تاریخ انتشار 2013